Acceleration of Block Matching on a Low-Power Heterogeneous Multi-Core Processor Based on DTU Data-Transfer with Data Re-Allocation

Authors

  • Yoshitaka Hiramatsu
  • Hasitha Muthumala Waidyasooriya
  • Masanori Hariyama
  • Tohru Nojiri
  • Kunio Uchiyama
  • Michitaka Kameyama
Abstract

Long data-transfer times between cores are a major problem in heterogeneous multi-core processors. This paper presents a method that accelerates data transfers by using data-transfer units (DTUs) together with complex memory allocation. We use block matching, which is very common in image processing, to evaluate the technique. The proposed method reduces the data-transfer time by more than 42% compared with earlier works that use CPU-based data transfers. Moreover, the total processing time is only 15 ms for a VGA image with 16 × 16 pixel blocks.

Key words: block matching, heterogeneous multi-core, dynamically reconfigurable processor, data transfer, accelerator
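For readers unfamiliar with the benchmark, block matching can be sketched as follows. This is only an illustration, not the authors' implementation: the abstract does not state the matching criterion or the search range, so the sum of absolute differences (SAD), the `search` parameter, and all function names here are assumptions.

```python
# Illustrative full-search block matching on plain 2-D lists of pixel values.
# Assumptions (not from the paper): SAD cost, search range, all names below.

def sad(ref, cur, rx, ry, cx, cy, block):
    """Sum of absolute differences between a block in ref and a block in cur."""
    total = 0
    for dy in range(block):
        for dx in range(block):
            total += abs(ref[ry + dy][rx + dx] - cur[cy + dy][cx + dx])
    return total


def match_block(ref, cur, cx, cy, block=16, search=4):
    """Find the displacement (dx, dy) into ref that best matches the
    block whose top-left corner is (cx, cy) in cur.

    Returns (best_dx, best_dy, best_cost).
    """
    height, width = len(ref), len(ref[0])
    # Start from zero displacement as the baseline candidate.
    best = (0, 0, sad(ref, cur, cx, cy, cx, cy, block))
    for dy in range(-search, search + 1):
        for dx in range(-search, search + 1):
            rx, ry = cx + dx, cy + dy
            # Skip candidates that fall outside the reference image.
            if rx < 0 or ry < 0 or rx + block > width or ry + block > height:
                continue
            cost = sad(ref, cur, rx, ry, cx, cy, block)
            if cost < best[2]:
                best = (dx, dy, cost)
    return best


def blocks_per_frame(width=640, height=480, block=16):
    """Number of non-overlapping blocks in a frame (VGA by default)."""
    return (width // block) * (height // block)
```

With 16 × 16 pixel blocks, a 640 × 480 VGA frame holds 40 × 30 = 1200 non-overlapping blocks, and each block reads many overlapping candidate regions from the reference image, which suggests why data movement between cores matters for this workload.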

Similar resources

Task Allocation with Algorithm Transformation for Reducing Data-Transfer Bottlenecks in Heterogeneous Multi-Core Processors: A Case Study of HOG Descriptor Computation

Heterogeneous multi-core processors are attracted by the media processing applications due to their capability of drawing strengths of different cores to improve the overall performance. However, the data transfer bottlenecks and limitations in the task allocation due to the accelerator-incompatible operations prevent us from gaining full potential of the heterogeneous multi-core processors. T...

Mapping for a Heterogeneous Multi-Core Media Processor Considering the Data Transfer Time

Heterogeneous multi-core processors are attracted by the media processing applications due to their capability of drawing strengths of different cores to improve the overall performance. However, the data transfer bottlenecks and limitations in the task allocation have prevented us from gaining full potential of the heterogeneous multi-core processors. This paper presents a task allocation method...

Ultra-Low-Energy DSP Processor Design for Many-Core Parallel Applications

Background and Objectives: Digital signal processors are widely used in energy constrained applications in which battery lifetime is a critical concern. Accordingly, designing ultra-low-energy processors is a major concern. In this work and in the first step, we propose a sub-threshold DSP processor. Methods: As our baseline architecture, we use a modified version of an existing ultra-low-power...

A 45-nm 37.3 GOPS/W Heterogeneous Multi-Core SOC with 16/32 Bit Instruction-Set General-Purpose Core

We built a 12.4 mm × 12.4 mm, 45-nm CMOS chip that integrates eight 648-MHz general purpose cores, two matrix processor (MX-2) cores, four flexible engine (FE) cores and media IP (VPU5) to establish a heterogeneous multi-core chip architecture. The general purpose core had its IPC (instructions per cycle) performance enhanced by adding 32-bit instructions to the existing 16-bit fixed-length inst...

Data-Transfer-Aware Memory Allocation for Dynamically Reconfigurable Accelerators in Heterogeneous Multicore Processors

Accelerator cores in low-power heterogeneous multicore processors have multiple memory modules to enable parallel data access. Recent low-power processors contain address generation units (AGUs) for fast address generation. To reduce the core-area, small functional units such as adders and counters are used in AGUs. Such small functional units make it difficult to implement complex addressing p...

Journal title:
  • IEICE Transactions

Volume 95-C  Issue 

Pages  -

Publication date 2012